Segmentation of Unstructured Newspaper Documents



Similar resources

Arabic Newspaper Page Segmentation

The aim of layout analysis is to extract the geometric structure from a document image. It consists of labeling homogeneous regions of a document image. This paper describes the performance of segmentation algorithms and their adaptation in order to treat complex structured Arabic documents such as newspapers. Experimental tests have been carried out on four different phases of newspaper image a...


Anonimytext: Anonimization of Unstructured Documents

The anonymization of unstructured texts is nowadays a task of great importance in several text mining applications. Medical record anonymization is needed both to preserve the privacy of personal health information and to enable further data mining efforts. The described ANONYMITEXT system is designed to de-identify sensitive data in unstructured documents. It has been applied to Spanish clinical notes...


Ontology-Based Semantic Classification of Unstructured Documents

As more and more knowledge and information becomes available through computers, a critical capability of systems supporting knowledge management is the classification of documents into categories that are meaningful to the user. In a step beyond the use of keywords, we developed a system that analyzes the sentences contained in unstructured or semi-structured documents, and utilizes an ontology...


Segmentation of Compressed Documents

We present a novel technique for segmentation of a JPEG-compressed document based on block activity. The activity is measured as the number of bits spent to encode each block. Each number is mapped to a pixel brightness value in an auxiliary image which is then used for segmentation. We introduce the use of such an image and show an example of a simple segmentation algorithm, which was successfu...


Features for Neural Net Based Region Identification of Newspaper Documents

Several features for neural-network-based document region identification are tested. Specifically, this paper examines features for non-text region identification. The neural-network-based region identification algorithm is a key component of a document recognition system that segments a document into regions, classifies them into text, graphic, photo, and other region types, and then uses this...



Journal

Journal title: International Journal of Advanced Engineering Research and Science

Year: 2017

ISSN: 2349-6495, 2456-1908

DOI: 10.22161/ijaers.4.5.13